docs: fix typo in Lightning Trainer documentation#21801
Closed
mayank-dev-15 wants to merge 845 commits into
replace pip with uv Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
add documentation on multi test dataloaders Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
Update timeout and retry configuration Increased timeout and adjusted retry settings.
* Update device assignment logic to support 'mps' accelerator * Add tests for MPS accelerator mixed precision device selection * Refactor MPS mixed precision device selection tests for parameterized input * chlog * Apply suggestions Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com> * apply suggestions * Empty-Commit * Empty Commit * Empty-Commit --------- Co-authored-by: jirka <jirka.borovec@seznam.cz> Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com> Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
…ing-AI#21235) Bumps [mypy](https://github.com/python/mypy) from 1.18.1 to 1.18.2. - [Changelog](https://github.com/python/mypy/blob/master/CHANGELOG.md) - [Commits](python/mypy@v1.18.1...v1.18.2) --- updated-dependencies: - dependency-name: mypy dependency-version: 1.18.2 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Bhimraj Yadav <bhimrajyadav977@gmail.com>
…18.1,<=0.22.2 in /requirements (Lightning-AI#21238) build(deps): update docutils requirement in /requirements --- updated-dependencies: - dependency-name: docutils dependency-version: 0.22.2 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* simplify lit CI config * Empty-Commit * Empty-Commit * Update --------- Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
…ing-AI#21243) * fix example * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
…r subclass model mode (Lightning-AI#21246) * Fix LightningCLI loading of hyperparameters from ckpt_path failing for subclass model mode * Changelog pull number * Update src/lightning/pytorch/cli.py Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com> --------- Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com> Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com>
…ng-AI#21237) Bumps [click](https://github.com/pallets/click) from 8.2.1 to 8.3.0. - [Release notes](https://github.com/pallets/click/releases) - [Changelog](https://github.com/pallets/click/blob/main/CHANGES.rst) - [Commits](pallets/click@8.2.1...8.3.0) --- updated-dependencies: - dependency-name: click dependency-version: 8.3.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Bhimraj Yadav <bhimrajyadav977@gmail.com> Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com> Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com>
Update LitServe description for clarity
… to PyTorch Lightning?'
Updated links in README to include UTM parameters for tracking.
readme updates
Update readme
…>=1.12.0,<1.24.0 in /requirements (Lightning-AI#21252) build(deps): update onnxruntime requirement in /requirements Updates the requirements on [onnxruntime](https://github.com/microsoft/onnxruntime) to permit the latest version. - [Release notes](https://github.com/microsoft/onnxruntime/releases) - [Changelog](https://github.com/microsoft/onnxruntime/blob/main/docs/ReleaseManagement.md) - [Commits](microsoft/onnxruntime@v1.12.0...v1.23.0) --- updated-dependencies: - dependency-name: onnxruntime dependency-version: 1.23.0 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
….0 in /requirements (Lightning-AI#21267) build(deps): update ipython[notebook] requirement in /requirements Updates the requirements on [ipython[notebook]](https://github.com/ipython/ipython) to permit the latest version. - [Release notes](https://github.com/ipython/ipython/releases) - [Commits](ipython/ipython@rel-0.8.4...9.6.0) --- updated-dependencies: - dependency-name: ipython[notebook] dependency-version: 9.6.0 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Corrected the spelling of 'pretrai' to 'pretrain' in the README.
* docs: add 'typing.Union' to nitpick_ignore_regex in Sphinx configuration * docs: add Supermicro link to linkcheck_ignore due to 403 error
Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 6 to 7. - [Release notes](https://github.com/astral-sh/setup-uv/releases) - [Commits](astral-sh/setup-uv@v6...v7) --- updated-dependencies: - dependency-name: astral-sh/setup-uv dependency-version: '7' dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Bhimraj Yadav <bhimrajyadav977@gmail.com>
…ements (Lightning-AI#21283) * build(deps): bump pytest-rerunfailures in /requirements Bumps [pytest-rerunfailures](https://github.com/pytest-dev/pytest-rerunfailures) from 16.0.1 to 16.1. - [Changelog](https://github.com/pytest-dev/pytest-rerunfailures/blob/master/CHANGES.rst) - [Commits](pytest-dev/pytest-rerunfailures@16.0.1...16.1) --- updated-dependencies: - dependency-name: pytest-rerunfailures dependency-version: '16.1' dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> * fix: update pytest-rerunfailures version constraints for Python compatibility --------- Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Bhimraj Yadav <bhimrajyadav977@gmail.com>
* [pre-commit.ci] pre-commit suggestions updates: - [github.com/pre-commit/pre-commit-hooks: v5.0.0 → v6.0.0](pre-commit/pre-commit-hooks@v5.0.0...v6.0.0) - [github.com/astral-sh/ruff-pre-commit: v0.12.2 → v0.13.3](astral-sh/ruff-pre-commit@v0.12.2...v0.13.3) - [github.com/pre-commit/mirrors-prettier: v3.1.0 → v4.0.0-alpha.8](pre-commit/mirrors-prettier@v3.1.0...v4.0.0-alpha.8) * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Replace Prettier repo with custom version --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com> Co-authored-by: Deependu <deependujha21@gmail.com> Co-authored-by: Bhimraj Yadav <bhimrajyadav977@gmail.com>
…ghtning-AI#21234) Bumps [coverage](https://github.com/nedbat/coveragepy) from 7.10.6 to 7.10.7. - [Release notes](https://github.com/nedbat/coveragepy/releases) - [Changelog](https://github.com/nedbat/coveragepy/blob/master/CHANGES.rst) - [Commits](coveragepy/coveragepy@7.10.6...7.10.7) --- updated-dependencies: - dependency-name: coverage dependency-version: 7.10.7 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Bhimraj Yadav <bhimrajyadav977@gmail.com> Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com> Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com>
* fix pylint E1120 error. * fix type annotation for py39 * fix mypy error. * fix mypy error. * fix mypy error. * fix wrapper return type. * refactored `_restricted_classmethod_impl` * revert refactor * ignore mypy error. * resume original type. * revert mypy. * attempt fixing macos doctest --------- Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com> Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com> Co-authored-by: Justus Schock <12886177+justusschock@users.noreply.github.com>
…g package (Lightning-AI#21275) * update * update --------- Co-authored-by: Bhimraj Yadav <bhimrajyadav977@gmail.com> Co-authored-by: Nicki Skafte Detlefsen <skaftenicki@gmail.com>
Pin official GitHub workflow actions
Pin disabled TPU workflow actions
Pin third-party CI workflow actions
Pin internal workflow actions
Fix Sphinx parameter permalink positioning
* update * update * update * update * update
* ci: temporarily disable litbot checkgroup * update
Restructure the "Synchronize validation and test logging" section in accelerator_prepare.rst into a problem-framing intro plus three subsections (sync_dist, TorchMetrics, manual all_gather), a decision table, and a common-pitfalls list. Directly addresses the custom-metric case: accumulate per-step outputs, call all_gather at epoch end, and compute the metric. The "my compute runs N times" confusion is called out and resolved — after all_gather every rank holds the same data, so the redundant compute is cheap and correct; only self.log needs the rank_zero_only guard. Refs Lightning-AI#20117 Co-authored-by: Deependu <deependujha21@gmail.com>
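The custom-metric flow described in this commit message (accumulate per-step outputs, gather at epoch end, compute on every rank, log on rank zero only) can be illustrated with a small pure-Python simulation. This is only a sketch: `fake_all_gather`, `per_rank_outputs`, and `logged` are hypothetical stand-ins invented for the illustration, and no Lightning or PyTorch API is called.

```python
from statistics import fmean


def fake_all_gather(per_rank_outputs):
    """Simulate all_gather: every rank receives the outputs of all ranks."""
    gathered = [x for outputs in per_rank_outputs for x in outputs]
    return [list(gathered) for _ in per_rank_outputs]


# Per-step outputs accumulated on each of two simulated ranks (made-up values).
per_rank_outputs = [[0.5, 1.0], [0.0, 0.5]]

logged = {}   # stands in for the logger; written by rank 0 only
metrics = []  # metric value computed on each rank, kept for comparison

for rank, gathered in enumerate(fake_all_gather(per_rank_outputs)):
    # Every rank runs the same compute on the same gathered data.
    metric = fmean(gathered)
    metrics.append(metric)
    # Only the logging call is guarded by the rank check.
    if rank == 0:
        logged["val_metric"] = metric
```

Because every simulated rank sees identical gathered data, each entry of `metrics` is the same value, which is the point the commit message makes about the redundant compute being correct.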
…ightning-AI#21708) * fix: use del instead of explicit shutdown in CombinedLoader.reset() * Update src/lightning/pytorch/CHANGELOG.md --------- Co-authored-by: Deependu <deependujha21@gmail.com>
…ightning-AI#21709) Bumps [pypa/gh-action-pypi-publish](https://github.com/pypa/gh-action-pypi-publish) from 1.13.0 to 1.14.0. - [Release notes](https://github.com/pypa/gh-action-pypi-publish/releases) - [Commits](pypa/gh-action-pypi-publish@ed0c539...cef2210) --- updated-dependencies: - dependency-name: pypa/gh-action-pypi-publish dependency-version: 1.14.0 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…I#21750) Bumps [astral-sh/setup-uv](https://github.com/astral-sh/setup-uv) from 7.6.0 to 8.1.0. - [Release notes](https://github.com/astral-sh/setup-uv/releases) - [Commits](astral-sh/setup-uv@37802ad...0880764) --- updated-dependencies: - dependency-name: astral-sh/setup-uv dependency-version: 8.1.0 dependency-type: direct:production update-type: version-update:semver-major ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
…htning-AI#21686) * Fix toggle_optimizer breaking under torch.compile (Lightning-AI#21513) `LightningModule.toggle_optimizer` and `untoggle_optimizer` mutate `requires_grad` on parameters to implement multi-optimizer gradient masking. Dynamo/AOTAutograd does not support `setattr()` on `Tensor.requires_grad` because it can change a tensor's leaf-ness mid-graph, so when the `LightningModule` is wrapped with `torch.compile` tracing either graph-breaks with "Unsupported: setattr() on Tensor.requires_grad" or raises a `KeyError` on the internal `param_requires_grad_state` mapping when the traced parameter references diverge from those held by `trainer.optimizers`. Decorate both helpers with `@torch.compiler.disable` (the same pattern already used for logging bookkeeping in `logger_connector/result.py`) so they run as opaque Python when called from a compiled `training_step`. Eager behavior is unchanged. Adds a CPU regression test that compiles a two-optimizer `LightningModule` calling `toggle_optimizer` / `untoggle_optimizer` in `training_step` and exercises one training iteration, plus a CHANGELOG entry. * Narrow test_toggle_untoggle to check compiler.disable attribute (Lightning-AI#21513) The previous regression test compiled a `LightningModule` end-to-end and called `self.optimizers()` inside the compiled `training_step`, which unrelated to the toggle_optimizer fix trips a separate Dynamo limitation: tracing `self.trainer.strategy._lightning_optimizers` raises `InternalTorchDynamoError: GetAttrVariable(...) has no type` across all CI platforms and torch versions. The shipped fix — `@torch.compiler.disable` on `toggle_optimizer` / `untoggle_optimizer` — does not require a full compiled trainer run to verify; it only guarantees Dynamo skips those two methods. 
Replace the integration test with a direct attribute check that both methods carry the `_torchdynamo_disable` marker installed by `torch.compiler.disable`, following the same `has_dynamo(fn)` pattern already used by `tests/utilities/test_compile.py::test_compile_uncompile`. Toggle/untoggle functional correctness remains covered by the existing `test_toggle_untoggle_2_optimizers_no_shared_parameters` and `test_toggle_untoggle_3_optimizers_shared_parameters` tests in this file. --------- Co-authored-by: Deependu <deependujha21@gmail.com>
…g-AI#21707) * feat: add filter_keys to log only specified device stats * update
…c & Trainer (Lightning-AI#21746) * update * update * update * update * update * update * update * update * Apply suggestion from @deependujha
…#21743) * added arguments * Fix Hopper FLOPs inconsistency and add using_sparse_model flag * Update src/lightning/fabric/utilities/throughput.py Co-authored-by: Deependu <deependujha21@gmail.com> * made _CUDA_FLOPS consistent with a*10^b so it is consistent * resolved the reviewers comments * Added changed logs * update * update --------- Co-authored-by: Deependu <deependujha21@gmail.com> Co-authored-by: thomas chaton <thomas@grid.ai>
…1747) * jsonargparse 4.39 required for CLI, check at run-time Signed-off-by: Adam J. Stewart <ajstewart426@gmail.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Signed-off-by: Adam J. Stewart <ajstewart426@gmail.com> Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…store (Lightning-AI#21758) * Fix: add weights_only parameter to LearningRateFinder and propagate through LR finder call chain * Test: ensure LearningRateFinder supports weights_only=False during checkpoint restore * Test: ensure LearningRateFinder supports weights_only=False during checkpoint restore and apply pre-commit * Style: apply pre-commit formatting to LR Finder patch * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
…ontract (Lightning-AI#21774) * fix(strategy): pass CPU device to FSDP device_id on CPU Passing `device_id=None` (which is `root_device.index` for CPU) triggers a guard in PyTorch >= 2.5 ("FSDP needs a non-CPU accelerator device"). Passing the actual `torch.device("cpu")` avoids this guard and allows FSDP to run on CPU. - Update `FSDPStrategy` in both Fabric and PyTorch to pass `self.root_device` if it is a CPU device, otherwise `self.root_device.index`. - Add unit tests to verify `device_id` selection for CPU and GPU devices. CONV=516b4463-4a64-48ac-aabe-20fc589f838a TAG=agy * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * chore: update CHANGELOG * refactor(FSDP): address review feedback on device_id CPU fix - common out `device_id` into a local so the debug log reflects the device actually passed (was logging root_device.index = None on CPU) - add call-site comment noting CPU is not a supported FSDP path; the branch only honors the torch>=2.5 device_id contract - replace the parametrized CPU/GPU device_id tests with a single minimal CPU-only assertion per strategy: guards the device_id=None regression without codifying the GPU path as a tested contract - reword CHANGELOG to frame this as a narrow robustness fix * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Potential fix for pull request finding Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com> * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * apply copilot suggestion --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Deependu <deependujha21@gmail.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
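The `device_id` selection described in this commit message can be sketched without PyTorch as follows. `Device` and `select_device_id` are hypothetical names introduced for illustration (`Device` mimics only the `type` and `index` attributes of `torch.device`); this is not the actual `FSDPStrategy` code.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Device:
    """Hypothetical stand-in for torch.device (only `type` and `index`)."""

    type: str
    index: Optional[int] = None


def select_device_id(root_device: Device) -> Union[Device, Optional[int]]:
    """Pick the value to pass as FSDP's device_id, per the commit message.

    On CPU the index is None, which the commit says trips a guard in
    PyTorch >= 2.5, so the device object itself is passed instead.
    """
    if root_device.type == "cpu":
        return root_device
    return root_device.index


cpu_choice = select_device_id(Device("cpu"))
gpu_choice = select_device_id(Device("cuda", 1))
```

Computing the value once in a helper like this also matches the review feedback in the commit, where the debug log should report the value actually passed rather than `root_device.index`.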
…Lightning-AI#21776) * Profile dataloader initialization and update profiler docs * docs: update Advanced Profiler example to use run_training_batch * update * update --------- Co-authored-by: deependujha <deependujha21@gmail.com> Co-authored-by: thomas chaton <thomas@grid.ai>
Collaborator
|
Pls rebase, seems you dragged in too many unrelated commits... |
Collaborator
|
Thanks for the contribution! Unfortunately, this PR contains hundreds of unrelated commits, making it impractical to review or merge. I'm going to close this PR. Could you please create a fresh branch from the latest |
Fixes two issues in the Trainer documentation:
Grammar fixes only, no code changes.